Splitting Data Objects to Increase Cache Utilization (Preliminary Version)
Authors
Abstract
We present a technique to increase data cache utilization of pointer-based programs. These caches are often underutilized when an algorithm accesses certain data-members much more frequently than others. In programs that allocate large numbers of identical objects, this can result in some cache lines becoming very busy, whereas others are left virtually untouched because "cold" program data is mapped onto them. Our solution consists of grouping all the frequently accessed data members together and physically separating them from the less frequently accessed ones. Each object gets split into a "hot" part and zero or more "cold" parts of equal size, which are evenly spaced from each other in memory, facilitating simple address calculation. The hot parts of different objects can then be allocated contiguously, evenly tiling the data cache and thereby increasing its utilization. Our technique is fully automatic, based on profiling, and extends earlier work on automated cache-conscious storage layout of dynamically allocated data structures. It is applicable to all type-safe programming languages that completely abstract from physical storage layout; examples of such languages are Java and Oberon.
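To make the layout concrete, the Java sketch below applies the hot/cold split by hand to a hypothetical list node whose key and successor link are read on every traversal, while its description and creation time are rarely touched. The class and field names are our own illustration; where the paper locates a cold part from its hot part by a fixed spacing in memory, the sketch approximates this with a shared array index, and the actual technique performs the split automatically from profile data rather than requiring the programmer to restructure the code.

// Hypothetical original object (illustration only):
//   class Node { int key; Node next; String description; long createdAt; }
// key/next are "hot"; description/createdAt are "cold".

final class SplitNodePool {
    // Hot parts of all nodes, stored contiguously so they tile the data cache.
    private final int[] key;
    private final int[] next;            // index of the successor node, -1 = null

    // Cold parts, kept in a separate region; the cold part of node i is at index i.
    private final String[] description;
    private final long[] createdAt;

    SplitNodePool(int capacity) {
        key = new int[capacity];
        next = new int[capacity];
        description = new String[capacity];
        createdAt = new long[capacity];
    }

    // Hot-path traversal touches only the hot arrays.
    int sumKeys(int head) {
        int sum = 0;
        for (int i = head; i != -1; i = next[i]) {
            sum += key[i];
        }
        return sum;
    }

    // Cold accessor: the same index locates the cold part, so no extra pointer is stored.
    String describe(int i) {
        return description[i] + " (created at " + createdAt[i] + ")";
    }
}

A traversal such as sumKeys then touches only the densely packed hot arrays, so each fetched cache line is filled with data the loop actually uses.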
Similar Papers
Delayed Popularity-Aware Web Proxy Caching Algorithms
The World Wide Web has been a very successful distributed system that distributes and shares information on the Internet. Caching objects close to the user community allows other users to get the same objects quickly. Although caching can reduce the delays of retrieving popular objects, the utilization of the disk space in the caching servers is usually very low. Low disk ...
Differential Multithreading: Recapturing Pipeline Stall Cycles and Enhancing Throughput in Small-scale Embedded Microprocessors
This paper presents Differential Multithreading (dMT) as an inexpensive way to achieve high throughput from a single-issue architecture. dMT switches among multiple instruction streams in response to pipeline stall conditions but saves in-flight instructions, thus squashing pipeline bubbles and ensuring maximal utilization of a single pipeline. dMT uses auxiliary pipeline registers to save the st...
Towards Enabling Low-Level Memory Optimisations at the High-Level with Ownership-like Annotations
In modern architectures, due to the huge gap between CPU performance and memory bandwidth, an application’s performance highly depends on the speed at which the system is able to deliver data to operate on. The placement of data in memory affects the number of cache misses, and thus the overall speed of the application. To address this, pooling and splitting are two techniques that allow to gro...
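The excerpt is cut off, but the pooling half of that comparison can be sketched with a minimal Java object pool (our own example, not the annotation scheme the paper proposes): allocating all instances of a type in one burst means objects that are used together are also likely to be laid out near each other, and recycling them avoids scattering fresh allocations across the heap. A JVM ultimately controls object placement, so in Java this is a locality heuristic rather than a guarantee.

import java.util.ArrayDeque;
import java.util.function.Supplier;

// Minimal object-pool sketch: instances are allocated together up front and recycled,
// instead of being created and discarded all over the heap.
final class Pool<T> {
    private final ArrayDeque<T> free = new ArrayDeque<>();

    Pool(int size, Supplier<T> factory) {
        for (int i = 0; i < size; i++) {
            free.push(factory.get());    // bulk allocation keeps instances close together
        }
    }

    T acquire() {
        return free.pop();               // hand out a pooled instance
    }

    void release(T obj) {
        free.push(obj);                  // return it for later reuse
    }
}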
Scalable Web Caching of Frequently Updated Objects Using Reliable Multicast
Frequently updated web objects reduce the benefit of caching, increase the problem of cache inconsistency, and aggravate the inefficiency of the conventional "repeated unicast" delivery model. In this paper, we investigate multicast invalidation and delivery of popular, frequently updated objects to web cache proxies. Our protocol, MMO, groups objects into volumes, each of which maps to one IP ...
Improve Prefetch Performance by Splitting the Cache Replacement Queue
The performance of a prefetch cache depends on both the prefetch technique and the cache replacement policy. Both algorithms execute independently of each other, but they share a data structure: the cache replacement queue. This paper shows that even with a simple prefetch technique, there is an increase in hit rate when the LRU replacement queue is split into two equal-sized queues. ...
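As a rough illustration of such a split, the Java sketch below maintains two equal-sized LRU queues instead of one. The criterion used here, separating prefetched-but-unreferenced blocks from blocks that have already seen a demand hit, is our assumption for the example and not necessarily the policy the paper evaluates.

import java.util.ArrayDeque;
import java.util.HashSet;

// Two equal-sized LRU queues instead of one (split criterion assumed for illustration):
// blocks brought in by the prefetcher stay in 'prefetched' until a demand access
// promotes them to 'referenced'; each queue evicts its own LRU entry when full.
final class SplitLruCache {
    private final int half;                                   // capacity of each queue
    private final ArrayDeque<Integer> prefetched = new ArrayDeque<>();
    private final ArrayDeque<Integer> referenced = new ArrayDeque<>();
    private final HashSet<Integer> resident = new HashSet<>();

    SplitLruCache(int capacity) {
        this.half = capacity / 2;
    }

    // Called by the prefetch engine; victims are taken only from the prefetched queue.
    void prefetch(int block) {
        if (resident.contains(block)) {
            return;                                           // already cached
        }
        if (prefetched.size() == half) {
            resident.remove(prefetched.pollLast());           // evict LRU prefetched block
        }
        prefetched.addFirst(block);
        resident.add(block);
    }

    // Called on a demand access; returns true on a cache hit.
    boolean access(int block) {
        boolean hit = resident.contains(block);
        prefetched.remove(block);                             // promote out of the prefetch queue
        referenced.remove(block);                             // refresh LRU position on a hit
        if (referenced.size() == half) {
            resident.remove(referenced.pollLast());           // evict LRU referenced block
        }
        referenced.addFirst(block);
        resident.add(block);
        return hit;
    }
}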